-
Riverine flooding events are expected to become increasingly severe in the coming decades due to climate change, resulting in an urgent need to build flood resilience in underserved areas of the country. West Virginia has some of the highest risk of flooding in the United States, which is often compounded by aging infrastructure and high levels of socioeconomic vulnerability. In June 2016, one storm caused flooding that killed twenty-three people, destroyed or damaged thousands of homes and businesses, and caused $1 billion in damages across the state. Some of the most affected towns have yet to fully recover. This mixed-methods community-engaged research project was the first systematic investigation of lessons learned from the 2016 floods in Greenbrier County, West Virginia, a place devastated by this disaster. Using a county-wide survey, focus groups, and participatory GIS (PGIS), this project resulted in the creation of community-informed geospatial products to communicate flood risk, as well as a set of community-identified recommendations for increasing resilience to future flood disasters. Findings offer critical insights for more effective flood response and recovery in West Virginia and other rural areas of the United States with high riverine flood risk.
-
Semantic segmentation algorithms, such as UNet, that rely on convolutional neural network (CNN)-based architectures, due to their ability to capture local textures and spatial context, have shown promise for anthropogenic geomorphic feature extraction when using land surface parameters (LSPs) derived from digital terrain models (DTMs) as input predictor variables. However, the operationalization of these supervised classification methods is limited by a lack of large volumes of quality training data. This study explores the use of transfer learning, where information learned from another, and often much larger, dataset is used to potentially reduce the need for a large, problem-specific training dataset. Two anthropogenic geomorphic feature extraction problems are explored: the extraction of agricultural terraces and the mapping of surface coal mine reclamation-related valley fill faces. Light detection and ranging (LiDAR)-derived DTMs were used to generate LSPs. We developed custom transfer parameters by attempting to predict geomorphon-based landforms using a large dataset of digital terrain data provided by the United States Geological Survey’s 3D Elevation Program (3DEP). We also explored the use of pre-trained ImageNet parameters and initializing models using parameters learned from the other mapping task investigated. The geomorphon-based transfer learning resulted in the poorest performance while the ImageNet-based parameters generally improved performance in comparison to a random parameter initialization, even when the encoder was frozen or not trained. Transfer learning between the different geomorphic datasets offered minimal benefits. We suggest that pre-trained models developed using large, image-based datasets may be of value for anthropogenic geomorphic feature extraction from LSPs even given the data and task disparities. More specifically, ImageNet-based parameters should be considered as an initialization state for the encoder component of semantic segmentation architectures applied to anthropogenic geomorphic feature extraction even when using non-RGB image-based predictor variables, such as LSPs. The value of transfer learning between the different geomorphic mapping tasks may have been limited due to smaller sample sizes, which highlights the need for continued research in using unsupervised and semi-supervised learning methods, especially given the large volume of digital terrain data available, despite the lack of associated labels.
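
As a hedged illustration of the ImageNet-initialization strategy this abstract describes, the sketch below uses the segmentation_models_pytorch library to build a UNet with an ImageNet-pretrained encoder and optionally freeze it. The library, backbone choice, and band count are assumptions for illustration, not details taken from the paper.

```python
# Sketch: initializing a UNet encoder with ImageNet weights for
# LSP inputs, then optionally freezing the encoder so its
# pretrained parameters are not updated during training.
# Assumes segmentation_models_pytorch; not the authors' exact code.
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet34",       # ImageNet-pretrained backbone (assumed choice)
    encoder_weights="imagenet",    # transfer-learned initialization
    in_channels=3,                 # e.g., three LSP bands standing in for RGB
    classes=2,                     # feature vs. background
)

# "Frozen encoder" variant mentioned in the abstract:
for p in model.encoder.parameters():
    p.requires_grad = False
```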
-
Convolutional neural network (CNN)-based deep learning (DL) methods have transformed the analysis of geospatial, Earth observation, and geophysical data due to their ability to model spatial context information at multiple scales. Such methods are especially applicable to pixel-level classification or semantic segmentation tasks. A variety of R packages have been developed for processing and analyzing geospatial data. However, there are currently no packages available for implementing geospatial DL in the R language and data science environment. This paper introduces the geodl R package, which supports pixel-level classification applied to a wide range of geospatial or Earth science data that can be represented as multidimensional arrays where each channel or band holds a predictor variable. geodl is built on the torch package, which supports the implementation of DL using the R and C++ languages without the need for installing a Python/PyTorch environment. This greatly simplifies the software environment needed to implement DL in R. Using geodl, geospatial raster-based data with varying numbers of bands, spatial resolutions, and coordinate reference systems are read and processed using the terra package, which makes use of C++ and allows for processing raster grids that are too large to fit into memory. Training loops are implemented with the luz package. The geodl package provides utility functions for creating raster masks or labels from vector-based geospatial data and image chips and associated masks from larger files and extents. It also defines a torch dataset subclass for geospatial data for use with torch dataloaders. UNet-based models are provided with a variety of optional ancillary modules or modifications. Common assessment metrics (i.e., overall accuracy, class-level recalls or producer’s accuracies, class-level precisions or user’s accuracies, and class-level F1-scores) are implemented along with a modified version of the unified focal loss framework, which allows for defining a variety of loss metrics using one consistent implementation and set of hyperparameters. Users can assess models using standard geospatial and remote sensing metrics and methods and use trained models to predict to large spatial extents. This paper introduces the geodl workflow, design philosophy, and goals for future development.
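
geodl itself is an R package built on torch for R, so its actual API is not shown here. As a language-neutral sketch of the chips-and-masks dataset pattern the abstract describes (a dataset subclass that pairs multiband predictor rasters with label masks for use with dataloaders), the following PyTorch version is offered purely for illustration; it is emphatically not geodl's API, and rasterio is an assumed choice for raster I/O.

```python
# Sketch of the chips-and-masks dataset pattern geodl implements in
# R torch, shown in PyTorch for illustration only (NOT geodl's API).
import torch
import rasterio
from torch.utils.data import Dataset

class ChipDataset(Dataset):
    """Pairs multiband predictor chips with single-band label masks."""
    def __init__(self, chip_paths, mask_paths):
        self.chip_paths = chip_paths
        self.mask_paths = mask_paths

    def __len__(self):
        return len(self.chip_paths)

    def __getitem__(self, idx):
        # Each chip: (bands, height, width); each band is a predictor.
        with rasterio.open(self.chip_paths[idx]) as src:
            chip = torch.from_numpy(src.read()).float()
        # Each mask: (height, width) of integer class codes.
        with rasterio.open(self.mask_paths[idx]) as src:
            mask = torch.from_numpy(src.read(1)).long()
        return chip, mask
```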
-
Evaluating classification accuracy is a key component of the training and validation stages of thematic map production, and the choice of metric has profound implications for both the success of the training process and the reliability of the final accuracy assessment. We explore key considerations in selecting and interpreting loss and assessment metrics in the context of data imbalance, which arises when the classes have unequal proportions within the dataset or landscape being mapped. The challenges involved in calculating single, integrated measures that summarize classification success, especially for datasets with considerable data imbalance, have led to much confusion in the literature. This confusion arises from a range of issues, including a lack of clarity over the redundancy of some accuracy measures, the importance of calculating final accuracy from population-based statistics, the effects of class imbalance on accuracy statistics, and the differing roles of accuracy measures when used for training and final evaluation. In order to characterize classification success at the class level, users typically generate averages from the class-based measures. These averages are sometimes generated at the macro-level, by taking averages of the individual-class statistics, or at the micro-level, by aggregating values within a confusion matrix and then calculating the statistic. We show that the micro-averaged producer’s accuracy (recall), user’s accuracy (precision), and F1-score, as well as weighted macro-averaged statistics where the class prevalences are used as weights, are all equivalent to each other and to the overall accuracy, and thus are redundant and should be avoided. Our experiment, using a variety of loss metrics for training, suggests that the choice of loss metric is not as complex as it might appear to be, despite the range of choices available, which include cross-entropy (CE), weighted CE, and micro- and macro-Dice. The highest, or close to highest, accuracies in our experiments were obtained by using CE loss for models trained with balanced data, and for models trained with imbalanced data, the highest accuracies were obtained by using weighted CE loss. We recommend that, since weighted CE loss used with balanced training is equivalent to CE, weighted CE loss is a good all-round choice. Although Dice loss is commonly suggested as an alternative to CE loss when classes are imbalanced, micro-averaged Dice is similar to overall accuracy, and thus is particularly poor for training with imbalanced data. Furthermore, although macro-Dice resulted in models with high accuracy when the training used balanced data, when the training used imbalanced data, the accuracies were lower than for weighted CE. In summary, the significance of this paper lies in its provision of readers with an overview of accuracy and loss metric terminology, insight regarding the redundancy of some measures, and guidance regarding best practices.
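
The redundancy claim at the heart of this abstract (micro-averaged recall, precision, and F1 all collapse to overall accuracy in single-label multiclass classification) can be checked numerically. The sketch below uses scikit-learn with toy labels; the label vectors are arbitrary, and any multiclass example exhibits the same equality.

```python
# Numeric check of the redundancy point: micro-averaged precision,
# recall, and F1 are all identical to overall accuracy.
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]   # imbalanced toy classes
y_pred = [0, 0, 1, 1, 2, 2, 2, 2, 0, 2]

acc = accuracy_score(y_true, y_pred)       # 0.7 here
for name, fn in [("precision", precision_score),
                 ("recall", recall_score),
                 ("f1", f1_score)]:
    micro = fn(y_true, y_pred, average="micro")
    assert abs(micro - acc) < 1e-12        # equal to overall accuracy
    print(name, micro, acc)
```

For what it is worth, the weighted CE loss the paper recommends corresponds, in PyTorch terms, to passing per-class weights to torch.nn.CrossEntropyLoss(weight=...); the paper's own implementation is not specified here.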
-
The challenges inherent in field validation data and real-world light detection and ranging (lidar) collections make it difficult to assess the best algorithms for using lidar to characterize forest stand volume. Here, we demonstrate the use of synthetic forest stands and simulated terrestrial laser scanning (TLS) for the purpose of evaluating which machine learning algorithms, scanning configurations, and feature spaces can best characterize forest stand volume. The random forest (RF) and support vector machine (SVM) algorithms generally outperformed k-nearest neighbor (kNN) for estimating plot-level vegetation volume regardless of the input feature space or number of scans. Also, the measures designed to characterize occlusion using spherical voxels generally provided higher predictive performance than measures that characterized the vertical distribution of returns using summary statistics by height bins. Given the difficulty of collecting a large number of scans to train models, and of collecting accurate and consistent field validation data, we argue that synthetic data offer an important means to parameterize models and determine appropriate sampling strategies.
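
The three-algorithm comparison described above maps directly onto standard scikit-learn regressors. The following hedged sketch compares RF, SVM, and kNN with cross-validated R² on synthetic data; the feature matrix, sizes, and hyperparameters are illustrative assumptions standing in for the paper's voxel- and height-bin metrics.

```python
# Sketch of the algorithm comparison: RF, SVM, and kNN regressors
# predicting plot-level volume from TLS-derived features.
# Synthetic data only; features and settings are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))   # e.g., spherical-voxel occlusion metrics
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=200)  # plot volume

for name, model in [
    ("RF", RandomForestRegressor(n_estimators=200, random_state=0)),
    ("SVM", SVR(kernel="rbf", C=10.0)),
    ("kNN", KNeighborsRegressor(n_neighbors=5)),
]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R2 = {r2:.3f}")
```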
-
Long-term terrestrial ecosystem monitoring is a critical component of documenting outcomes of land management actions, assessing progress towards management objectives, and guiding realistic long-term ecological goals, all through repeated observation and measurement. Traditional monitoring methods have evolved for specific applications in forestry, ecology, and fire and fuels management. While successful monitoring programs have clear goals, trained expertise, and rigorous sampling protocols, new advances in technology and data management can help overcome the most common pitfalls in data quality and repeatability. This paper presents Terrestrial Laser Scanning (TLS), a specific form of LiDAR (Light Detection and Ranging), as an emerging sampling method that can complement and enhance existing monitoring methods. TLS captures in high resolution the 3D structure of a terrestrial ecosystem (forest, grassland, etc.), and is increasingly efficient and affordable (<$30,000). Integrating TLS into ecosystem monitoring can standardize data collection, improve efficiency, and reduce bias and error. Streamlined data processing pipelines can rigorously analyze TLS data and incorporate constant improvements to inform management decisions and planning. The approach described in this paper utilizes portable, push-button TLS equipment that, when calibrated with initial transect sampling, captures detailed forestry, fuels, and ecological features in less than 5 minutes per plot. We also introduce an interagency automated processing pipeline and dashboard viewer for instant, user-friendly analysis and data retrieval of hundreds of metrics. Forest metrics and inventories produced with these methods offer effective decision-support data for managers to quantify landscape-scale conditions and respond with efficient action. This protocol further supports interagency compatibility for efficient natural resource monitoring across jurisdictional boundaries with uniform data, language, methods, and data analysis. With continued improvement of scanner capabilities and affordability, these data will shape the future of terrestrial ecosystem monitoring as an important means to address the increasingly fast pace of ecological change facing natural resource managers.
-
Many issues can reduce the reproducibility and replicability of deep learning (DL) research and application in remote sensing, including the complexity and customizability of architectures, variable model training and assessment processes and practice, inability to fully control random components of the modeling workflow, data leakage, computational demands, and the inherent nature of the process, which is complex, difficult to perform systematically, and challenging to fully document. This communication discusses key issues associated with convolutional neural network (CNN)-based DL in remote sensing for undertaking semantic segmentation, object detection, and instance segmentation tasks and offers suggestions for best practices for enhancing reproducibility and replicability and the subsequent utility of research results, proposed workflows, and generated data. We also highlight lingering issues and challenges facing researchers as they attempt to improve the reproducibility and replicability of their experiments.
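
One of the issues named above, the inability to fully control random components of the modeling workflow, has a widely used partial remedy in PyTorch. The sketch below pins the common seed sources; it is a generic practice, not the communication's specific recommendation, and it reduces rather than eliminates run-to-run variation (some GPU kernels remain nondeterministic).

```python
# One concrete reproducibility practice: pinning the random
# components of a PyTorch training workflow. Reduces, but does
# not eliminate, run-to-run variation.
import os
import random
import numpy as np
import torch

def set_seeds(seed: int = 42) -> None:
    random.seed(seed)                   # Python's RNG
    np.random.seed(seed)                # NumPy's global RNG
    torch.manual_seed(seed)             # CPU RNG
    torch.cuda.manual_seed_all(seed)    # all GPU RNGs
    torch.backends.cudnn.deterministic = True   # deterministic kernels
    torch.backends.cudnn.benchmark = False      # no autotuned kernels
    os.environ["PYTHONHASHSEED"] = str(seed)

set_seeds(42)
```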
-
Terrestrial laser scanning (TLS) data can offer a means to estimate subcanopy fuel characteristics to support site characterization, quantification of treatment or fire effects, and inform fire modeling. Using field and TLS data within the New Jersey Pinelands National Reserve (PNR), this study explores the impact of forest phenology and density of shrub height (i.e., shrub fuel bed depth) measurements on estimating average shrub heights at the plot-level using multiple linear regression and metrics derived from ground-classified and normalized point clouds. The results highlight the importance of shrub height sampling density when these data are used to train empirical models and characterize plot-level characteristics. We document larger prediction intervals (PIs), higher root mean square error (RMSE), and lower R-squared with reduction in the number of randomly selected field reference samples available within each plot. At least 10 random shrub heights collected in situ were needed to produce accurate and precise predictions, while 20 samples were ideal. Additionally, metrics derived from leaf-on TLS data generally provided more accurate and precise predictions than those calculated from leaf-off data within the study plots and landscape. This study highlights the importance of reference data sampling density and design and data characteristics when data will be used to train empirical models for extrapolation to new sites or plots.
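
The core modeling step here, ordinary least squares with prediction intervals for plot-level shrub height, is straightforward to express with statsmodels. In the hedged sketch below the data, predictor names, and sample sizes are illustrative assumptions, not the study's variables.

```python
# Sketch: OLS predicting plot-level mean shrub height from
# TLS-derived metrics, with 95% prediction intervals.
# Synthetic data; predictors are assumptions, not the study's.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
tls_metrics = rng.uniform(0.2, 2.0, size=(40, 3))  # e.g., height percentiles
mean_shrub_ht = 0.8 * tls_metrics[:, 0] + rng.normal(scale=0.1, size=40)

X = sm.add_constant(tls_metrics)
fit = sm.OLS(mean_shrub_ht, X).fit()

# Prediction intervals (obs_ci_*) for five new plots.
new_plots = sm.add_constant(rng.uniform(0.2, 2.0, size=(5, 3)),
                            has_constant="add")
pred = fit.get_prediction(new_plots)
print(pred.summary_frame(alpha=0.05)[["mean", "obs_ci_lower", "obs_ci_upper"]])
```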
-
Fire-prone landscapes found throughout the world are increasingly managed with prescribed fire for a variety of objectives. These frequent low-intensity fires directly impact lower forest strata, and thus estimating surface fuels or understory vegetation is essential for planning, evaluating, and monitoring management strategies and studying fire behavior and effects. Traditional fuel estimation methods can be applied to stand-level and canopy fuel loading; however, local-scale understory biomass remains challenging because of complex within-stand heterogeneity and fast recovery post-fire. Previous studies have demonstrated how single location terrestrial laser scanning (TLS) can be used to estimate plot-level vegetation characteristics and the impacts of prescribed fire. To build upon this methodology, co-located single TLS scans and physical biomass measurements were used to generate linear models for predicting understory vegetation and fuel biomass, as well as consumption by fire in a southeastern U.S. pineland. A variable selection method was used to select the six most important TLS-derived structural metrics for each linear model, where the model fit ranged in R² from 0.61 to 0.74. This study highlights prospects for efficiently estimating vegetation and fuel characteristics that are relevant to prescribed burning via the integration of a single-scan TLS method that is adaptable by managers and relevant for coupled fire–atmosphere models.
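
The variable-selection step described above (reducing many candidate TLS metrics to the six most important predictors for a linear model) can be sketched with scikit-learn. Recursive feature elimination is used here purely as a plausible stand-in; the paper's exact selection procedure is not specified, and the data are synthetic.

```python
# Sketch of selecting the six most important TLS-derived metrics
# for a linear biomass model. RFE is an assumed stand-in for the
# paper's variable selection method; data are synthetic.
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 20))   # 20 candidate TLS structural metrics
y = X[:, 2] + 0.5 * X[:, 7] + rng.normal(scale=0.3, size=60)  # biomass

selector = RFE(LinearRegression(), n_features_to_select=6).fit(X, y)
print("selected metric indices:", np.where(selector.support_)[0])
print("R2 with 6 metrics:", selector.score(X, y))
```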
-
Land-surface parameters derived from digital land surface models (DLSMs) (for example, slope, surface curvature, topographic position, topographic roughness, aspect, heat load index, and topographic moisture index) can serve as key predictor variables in a wide variety of mapping and modeling tasks relating to geomorphic processes, landform delineation, ecological and habitat characterization, and geohazard, soil, wetland, and general thematic mapping and modeling. However, selecting features from the large number of potential derivatives that may be predictive for a specific feature or process can be complicated, and existing literature may offer contradictory or incomplete guidance. The availability of multiple data sources and the need to define moving window shapes, sizes, and cell weightings further complicate selecting and optimizing the feature space. This review focuses on the calculation and use of DLSM parameters for empirical spatial predictive modeling applications, which rely on training data and explanatory variables to make predictions of landscape features and processes over a defined geographic extent. The target audience for this review is researchers and analysts undertaking predictive modeling tasks that make use of the most widely used terrain variables. To outline best practices and highlight future research needs, we review a range of land-surface parameters relating to steepness, local relief, rugosity, slope orientation, solar insolation, and moisture and characterize their relationship to geomorphic processes. We then discuss important considerations when selecting such parameters for predictive mapping and modeling tasks to assist analysts in answering two critical questions: What landscape conditions or processes does a given measure characterize? How might a particular metric relate to the phenomenon or features being mapped, modeled, or studied? We recommend the use of landscape- and problem-specific pilot studies to answer, to the extent possible, these questions for potential features of interest in a mapping or modeling task. We describe existing techniques to reduce the size of the feature space using feature selection and feature reduction methods, assess the importance or contribution of specific metrics, and parameterize moving windows or characterize the landscape at varying scales using alternative methods while highlighting strengths, drawbacks, and knowledge gaps for specific techniques. Recent developments, such as explainable machine learning and convolutional neural network (CNN)-based deep learning, may guide and/or minimize the need for feature space engineering and ease the use of DLSMs in predictive modeling tasks.
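
Two of the most widely used parameters named in this review, slope and aspect, can be derived from a terrain grid with simple finite differences. The sketch below is a simplified scheme with NumPy, not any particular GIS package's algorithm; the toy DEM, cell size, and aspect convention (degrees clockwise from north) are assumptions.

```python
# Sketch: deriving slope and aspect from a DTM grid with finite
# differences. Simplified; not a specific GIS package's algorithm.
import numpy as np

def slope_aspect(dem: np.ndarray, cell: float = 30.0):
    """Return slope (degrees) and aspect (degrees clockwise from north)."""
    # Gradients along rows (y) and columns (x), in rise per map unit.
    dz_dy, dz_dx = np.gradient(dem, cell)
    slope = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))
    aspect = np.degrees(np.arctan2(-dz_dx, dz_dy)) % 360.0
    return slope, aspect

# Toy DEM: a uniform ramp rising 2 m per row.
dem = np.outer(np.arange(50), np.ones(50)) * 2.0
slope, aspect = slope_aspect(dem, cell=30.0)
print(slope.mean(), aspect.mean())
```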
